Streamed Learning: One-Pass SVMs
Authors
Abstract
We present a streaming model for large-scale classification (in the context of l2-SVM) by leveraging connections between learning and computational geometry. The streaming model imposes the constraint that only a single pass over the data is allowed. The l2-SVM is known to have an equivalent formulation in terms of the minimum enclosing ball (MEB) problem, and an efficient core-set-based algorithm for it exists, the Core Vector Machine (CVM) [Tsang et al., 2005]. CVM learns a (1+ε)-approximate MEB for a set of points and yields an approximate solution to the corresponding SVM instance. However, CVM works in batch mode and requires multiple passes over the data. This paper presents a single-pass SVM based on the minimum enclosing ball of streaming data. We show that the MEB updates for the streaming case can be easily adapted to learn the SVM weight vector, in a way similar to online stochastic gradient updates. Our algorithm performs polylogarithmic computation per example and requires only a small, constant amount of storage. Experimental results show that, even in such a restrictive setting, we can learn efficiently in just one pass and reach accuracies comparable to other state-of-the-art SVM solvers (batch and online). We also give an analysis of the algorithm and discuss some open issues and possible extensions.
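To make the idea of a single-pass MEB update concrete, the following is a minimal sketch of a well-known simple streaming rule for enclosing balls: when a new point falls outside the current ball, the center moves toward the point and the radius grows by half of the excess distance. This is an illustration under stated assumptions, not necessarily the exact update used in the paper. The function name `streaming_meb` and the sample points are hypothetical, and the sketch works on plain points rather than on the SVM feature-space images the abstract refers to.

```python
import math


def streaming_meb(points):
    """One pass over `points`, keeping only a center and a radius.

    Assumption: this is a generic streaming enclosing-ball rule,
    used here for illustration; it is not taken from the paper.
    """
    it = iter(points)
    center = list(next(it))  # the first point becomes a ball of radius 0
    radius = 0.0
    for p in it:
        dist = math.dist(center, p)
        if dist > radius:
            # The point lies outside the ball: grow the ball just enough
            # to touch it while still covering the previous ball.
            delta = (dist - radius) / 2.0
            step = delta / dist
            center = [c + step * (x - c) for c, x in zip(center, p)]
            radius += delta
    return center, radius


# Hypothetical sample stream of 2-D points.
points = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0), (1.0, -3.0)]
center, radius = streaming_meb(points)
print(center, radius)
```

Each update touches only the current center and radius, so storage stays constant regardless of stream length. The resulting ball encloses every point seen so far, but it is in general only an approximation of the true minimum enclosing ball.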
Similar resources
Single-Pass Distributed Learning of Multi-class SVMs Using Core-Sets
We explore a technique to learn Support Vector Models (SVMs) when training data is partitioned among several data sources. The basic idea is to consider SVMs which can be reduced to Minimal Enclosing Ball (MEB) problems in a feature space. Computation of such SVMs can be efficiently achieved by finding a coreset for the image of the data in the feature space. Our main result is that the union ...
Guest editorial: Thematic issue on ‘Adaptive Soft Computing
Personalized Transductive Learning (PTL) builds a unique local model for classification of each test sample and is therefore neighborhood dependent in practice, i.e. a specific model is built in a subspace spanned by a set of samples adjacent to the test sample. While existing PTL methods usually define the neighborhood by a predefined (dis)similarity measure, in this paper we introduce a new c...
Spanning SVM Tree for Personalized Transductive Learning
Personalized Transductive Learning (PTL) builds a unique local model for classification of each test sample and is therefore neighborhood dependent in practice. While existing PTL methods usually define the neighborhood by a predefined (dis)similarity measure, in this paper we introduce a new concept of knowledgeable neighborhood and a transductive SVM classification tree (t-SVMT) for PTL. The ...
Image watermarking method in multiwavelet domain based on support vector machines
A novel image watermarking method in multiwavelet domain based on support vector machines (SVMs) is proposed in this paper. The special frequency band and property of image in ...
Dual coordinate solvers for large-scale structural SVMs
This manuscript describes a method for training linear SVMs (including binary SVMs, SVM regression, and structural SVMs) from large, out-of-core training datasets. Current strategies for large-scale learning fall into one of two camps: batch algorithms, which solve the learning problem given a finite dataset, and online algorithms, which can process out-of-core datasets. The former typically req...